Method for determining the position of an object from a digital image

ABSTRACT

A method for determining the position of an object point in a scene from a digital image thereof acquired through an optical system is presented. The image comprises a set of image points corresponding to object points, and the positions of the object points are determined by means of predetermined vectors associated with the image points. Each predetermined vector represents the inverted direction of the light ray in the object space that produces the corresponding image point through the optical system, and thus includes all distortion effects of the optical system.

TECHNICAL FIELD

The present invention generally relates to the determination of object positions using camera imaging. The invention more specifically relates to the determination of the position of objects in a scene from a digital image thereof, especially for use with vehicle occupant protection systems.

BACKGROUND ART

Automotive occupant protection systems including airbags and passive seat belt mechanisms are well known in the art, and equip most of the vehicles now produced. The deployment or actuation of such protection systems is typically based on acceleration sensors that output a trigger signal when the sensed acceleration exceeds a predetermined threshold.

While the introduction of airbags has proved successful in reducing the severity of injuries suffered in accidents, they can also, in certain circumstances, cause serious, sometimes fatal, injuries to vehicle occupants that are e.g. too small or improperly seated. This may for example be the case in situations where: a rear-facing infant seat is installed in the front seat of a vehicle; the driver is too close to the steering wheel; the passenger is too close to the dashboard; or a child is sitting in the front passenger seat. In such cases, it may be preferable to deactivate the airbag or adapt the airbag deployment conditions to the occupant.

Therefore, control systems have been developed that take into account variations in passenger/occupant conditions, to tailor the airbag deployment to the specific type of occupant present in the vehicle.

A variety of vision-based control systems for detecting the type and position of vehicle occupants have been proposed. They typically comprise at least one camera featuring an optical system (including one or more lenses) as well as an image sensor array for capturing images of seat regions. The spatial confinement in the vehicle usually requires a wide-angle lens to be able to survey the occupant area in the vehicle. The acquired images are processed in the control system to detect and classify vehicle occupants, and this occupancy classification is then used as input for airbag control.

An inherent technical problem of vision-based control systems is the optical distortion of light passing through the camera lens. For example, the so-called barrel distortion is a well-known effect observed with wide-angle lenses. A straight line appears to be curved in the image due to this distortion, which means that every image point appears to be displaced.

Therefore, image pre-processing is often used to remove distortion, and the occupant position is determined from the corrected, undistorted images thus obtained. This obviously requires a lot of data processing, high computing power and high storage capacity. Furthermore, the distortion removal processes typically introduce approximations in the positions of objects in the corrected images, which also affects position determination from such corrected images.

U.S. Pat. No. 6,005,958 describes a method and system for detecting the type and position of a vehicle occupant utilizing a single camera unit, which is capable of distinguishing between objects, forwardly or rearwardly facing infant seats, and adult occupants by periodically mapping an image taken of the interior of the vehicle into image profile data, and utilizing image profile matching with stored reference profile data to determine the occupant or object type. Instantaneous distance is also measured and changes in the measured distances are tracked. All of this information is then used to optimise deployment control of occupant protection systems.

WO 02/078346 describes a method for correcting an image, in particular for occupant protection systems, wherein a source image that is distorted by the camera optics is transformed into a corrected image by using a tabular imaging rule. None, one or several target pixels in the target image are assigned to each source pixel of the source image. This transformation occurs directly during read-out from the image sensor, i.e. in real time. In this method, the source image is transformed into an undistorted target image of reduced size, which saves storage capacity. Such a transformation rule however introduces approximations, which lead to inaccuracies in position determination based on such distortion-corrected images.

OBJECT OF THE INVENTION

The object of the present invention is to provide an improved method for determining object positions from a digital image. This object is achieved by a method as claimed in claim 1.

GENERAL DESCRIPTION OF THE INVENTION

The present invention proposes a method for determining the position of an object point (which may generally be a point in an extended object) in a scene from a digital image of this scene acquired through an optical system, the image comprising an image point corresponding to the object point. According to an important aspect of the invention, the position of the object point is determined by means of a predetermined vector associated with the image point, the predetermined vector representing the direction of a light ray in the object space that will produce this image point through the optical system. In other words, such a predetermined vector gives an indication of the direction of the object point from which the light producing the image point has emerged.

Contrary to known methods for determining object positions, which involve image transformation to provide a corrected, undistorted image, the present method is based on a mapping of the image points backward into the object space by means of vectors indicating the directions of the light rays actually producing these image points. Such vectors include the optical distortion of the optical system and thus allow an accurate position determination of object points in the observed scene without any transformation of the original image data. The method is further advantageous in that it requires neither approximations to the image data nor additional computing power.

The correct position information obtained by the present method can then advantageously be used for occupancy classification in the control of automotive occupant protection systems, such as an airbag system.

The predetermined vector associated with an image point preferably indicates the direction of a light ray passing through the optical centre of the lens assembly of the optical system. This gives a univocal indication of the direction of the object point from which the light ray producing the image point has emerged.

Distance information indicative of the measured remoteness of the corresponding object point is advantageously associated with each image point. In such a case, the position of an object point may be determined based on the predetermined vector and the distance information.

In a preferred embodiment, the optical system comprises a lens assembly and the image of the observed scene through the optical system is acquired by an image sensor array optically coupled to the lens assembly. The acquired image contains a number of image points that correspond to object points. Distance information is preferably associated with each image point. A grey value may conventionally also be associated with each image point. A respective predetermined vector is associated with each image point, representing the direction of the light ray in the object space that will produce this image point on the sensor array after passing through the lens assembly. The predetermined vector preferably indicates the direction of a light ray passing through the optical centre of the lens assembly.

This embodiment thus makes it possible to determine the 3D position of an object in the observed scene (e.g. a passenger compartment) based on the predetermined vectors and the distance information. This approach does not require the computation of a new, corrected image, but permits the 3D position of objects to be derived directly from the uncorrected image acquired by the sensor, thereby radically differing from prior art solutions.

The predetermined vectors are advantageously obtained by calibration. The calibration includes a step of identifying, for given areas (pixels) of the image sensor, the light rays in the object space that will respectively fall onto these areas when passing through the optical system. Such calibration procedures are known in the art and need not be explained in detail herein. Following this identification step, the directions of these rays are then used to calculate the predetermined vectors. It will be understood that the predetermined vectors correspond to a given configuration of the optical system and need not be systematically recalculated for each image. The predetermined vectors may thus advantageously be memorized or stored.

Each predetermined vector is preferably a unit vector in a reference coordinate system. This makes it possible to determine the coordinates of an object point corresponding to an image point by simply multiplying the measured distance information by the predetermined vector associated with this image point. This reference coordinate system may e.g. be the camera coordinate system, i.e. a three-dimensional coordinate system having its origin coinciding with the optical centre of the optical system.
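
By way of illustration only, the following Python sketch shows this core operation for a single image point (the function and variable names are hypothetical and not part of the disclosure; in practice the unit vector would be obtained beforehand by the calibration mentioned above):

    import numpy as np

    def object_point_position(distance: float, unit_vector: np.ndarray) -> np.ndarray:
        """Coordinates of the object point corresponding to an image point:
        the measured distance multiplied by the predetermined unit vector
        associated with that image point (camera coordinate system)."""
        return distance * unit_vector

    # Example: a pixel whose predetermined vector points slightly off-axis,
    # with a measured distance of 1.5 m.
    e = np.array([0.10, -0.05, 0.99])
    e = e / np.linalg.norm(e)          # ensure unit length
    r = object_point_position(1.5, e)  # -> (x, y, z) in metres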

If desired, the vectors, respectively unit vectors, in the reference coordinate system may be transformed for use in another coordinate system by conventional calculations. For the application of occupant classification in a vehicle, the preferred coordinate system would be the vehicle coordinate system rather than the camera coordinate system.

Although the present method is particularly well suited for use in automotive occupant protection systems, it can be used for a variety of applications requiring the determination of object positions from digital images.

The present method can be implemented to extract position information of objects from pictures obtained with a variety of optical systems. For 3D coordinate determination of objects, it is advantageous to use a camera acquiring distance information for each pixel (i.e. each point of the image). However, the distance information may also be determined by any appropriate method (e.g. stereoscopy) and afterwards associated with the corresponding image point. In this connection, the present method could be applied to images acquired by either of the two cameras of a stereoscopic camera system.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will now be described, by way of example, with reference to the accompanying drawings, in which:

FIG. 1: is a diagram illustrating the principle of the pinhole camera model;

FIG. 2: is a diagram illustrating the barrel distortion phenomenon;

FIG. 3: is a diagram illustrating a conventional distortion correction method; and

FIG. 4: is a diagram illustrating the principle of unit vector determination.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

A preferred embodiment of the present method will now be explained in detail. The specificities and advantages of the present method will become more apparent by comparison with the conventional approach of coordinate determination by camera imaging.

The Pinhole Camera Model

The ideal camera (the so-called pinhole camera) performs a pure perspective projection of a scene. This projection, used herein for explanatory purposes, can be described as a simple mapping from the so-called camera coordinate system r=(x,y,z) to the image coordinate system (u,v) by the simple relations

u=x·f/z and v=y·f/z.  (1)

Thereby the projection centre is the origin O=(0,0,0) of the camera coordinate system. The projection plane P is parallel to the (x,y) plane and is displaced by a distance f (the focal length) from O along the z-axis (the optical axis).

Ideally, light rays coming from the scene should pass through the optical centre linearly (FIG. 1). In that case, a point intersecting the projection plane at position (u,v) will be mapped on a point

(u_d, v_d) = (u, v)  (2)

in the image. Thereby, the image plane I is assumed to be placed at a distance f from the optical centre behind the camera lens, with its axes parallel to the camera coordinate system, i.e. the image plane is just the mirrored projection plane.
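
As a minimal illustration of equations (1) and (2), the following Python sketch (names and units are illustrative only) projects a scene point onto the image of an ideal pinhole camera:

    import numpy as np

    def pinhole_project(point: np.ndarray, f: float) -> np.ndarray:
        """Ideal perspective projection of a scene point r = (x, y, z),
        given in camera coordinates, to image coordinates (u, v) per
        equation (1); for the ideal camera, (u_d, v_d) = (u, v) per (2)."""
        x, y, z = point
        return np.array([x * f / z, y * f / z])

    uv = pinhole_project(np.array([0.3, -0.2, 2.0]), f=0.006)  # 6 mm lens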

There are 3D-cameras under development that can acquire a distance value d = |r| = √(x² + y² + z²) for every pixel. This distance, together with the position (u_d, v_d) of the corresponding pixel on the image, gives the (x, y, z) coordinates of the corresponding object point via the equations

z = d·f/√(f² + u_d² + v_d²),

x = u_d·z/f,

y = v_d·z/f,  (3)

which are just the inversion of equations (1) and (2).
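
Assuming the ideal pinhole model, equations (3) can be implemented directly. The sketch below (illustrative names; distances in metres) recovers the camera coordinates of an object point from a pixel's image position and its measured distance:

    import numpy as np

    def pinhole_backproject(u_d: float, v_d: float, d: float, f: float) -> np.ndarray:
        """Inversion of the ideal projection per equations (3): recover the
        camera coordinates (x, y, z) of the object point imaged at
        (u_d, v_d) whose measured distance from the origin is d."""
        z = d * f / np.sqrt(f**2 + u_d**2 + v_d**2)
        return np.array([u_d * z / f, v_d * z / f, z])

    xyz = pinhole_backproject(u_d=0.0009, v_d=-0.0006, d=2.0, f=0.006)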

In practice, camera lens systems do not show the ideal, simple mapping (2) but introduce some non-linear distortion into the optical paths and thus into the resulting image. An example of the so-called barrel distortion that typically occurs with wide-angle lenses is shown in FIG. 2. Straight lines appear to be curved since the image coordinates (u_d, v_d) are displaced with respect to the corresponding projection points (u,v), and equation (2) becomes

(u_d, v_d) = F(u, v)  (4).

In addition to radial distortion, there are other distortions, such as tangential distortion or distortions due to a displacement of the image sensor array from its correct position, that can all be represented by the non-linear function F (eq. 4). For a 3D-camera, all kinds of image distortions will generate displacements of the calculated (x,y,z) position from the real position, due to the displacement of the image coordinates (u_d, v_d).
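
The text does not prescribe a particular form for F. Purely as an illustration, a common assumption in the literature is a polynomial radial model, sketched below with hypothetical coefficients (a negative k1 produces the barrel distortion of FIG. 2):

    import numpy as np

    def radial_distortion(u: float, v: float, k1: float, k2: float) -> np.ndarray:
        """One possible form of the distortion function F of equation (4):
        a purely radial polynomial model; tangential terms are omitted."""
        r2 = u**2 + v**2
        scale = 1.0 + k1 * r2 + k2 * r2**2
        return np.array([u * scale, v * scale])

    u_d, v_d = radial_distortion(0.002, 0.001, k1=-45.0, k2=300.0)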

Conventional Distortion Rectification

2D-cameras record a matrix of grey values g_i, the matrix consisting of pixels placed at certain fixed positions (a_i, b_i) in the image plane. 3D-cameras record not only a grey value, but also a distance value at every pixel position.

Conventional image rectification aims at reconstructing, from the values g_i obtained in a distorted image, the ideal image, i.e. the image one would obtain with a pinhole camera. This image rectification first requires a calibration of the camera and involves the determination of the function F (eq. 4) that describes the displacement.

This can be done by comparing certain reference pattern(s) with images thereof taken from various points of view. There are various calibration methods based on well-established camera models (see e.g. J. Heikkilä, "Geometric Camera Calibration Using Circular Control Points", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, pp. 1066-1077, 2000; R. Y. Tsai, "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses", IEEE Journal of Robotics and Automation, Vol. RA-3, pp. 323-344, 1987), which need not be described herein in detail.

FIG. 3 shows how the image rectification is conventionally performed once the calibration function is determined. For every pixel i, the pixel position (a_i, b_i) is ideally mapped backward onto the projection plane P, i.e. it is mapped backward as if the camera were a pinhole camera. From the projection point (u_i, v_i) = (a_i, b_i), the corresponding displaced position (u_di, v_di) on the image plane is calculated by using the distortion function (4). The arrow in FIG. 3 indicates the displacement between the pixel position (a_i, b_i) and the displaced image position of the pixel. To construct an ideal image, the four pixels around the distorted position (u_di, v_di) are determined, and the grey value of pixel i in the ideal image is substituted by the grey value of the nearest of the four pixels, or by a linear combination of the grey values of the four pixels. This process is performed for every pixel, yielding a rectified image. A method for performing this rectification in real-time systems has for example been described in WO 02/078346 A1.
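
A minimal sketch of this conventional backward-mapping loop is given below, assuming a distortion function such as the radial model above and bilinear interpolation of the four neighbouring pixels (all names are illustrative; this reproduces the prior-art procedure, not the present method):

    import numpy as np

    def rectify(image: np.ndarray, distort, pixel_pitch: float) -> np.ndarray:
        """Conventional rectification (FIG. 3): for every pixel, look up its
        displaced position via the distortion function F (eq. 4) and
        resample the grey value there by bilinear interpolation."""
        h, w = image.shape
        out = np.zeros_like(image, dtype=float)
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
        for i in range(h):
            for j in range(w):
                # ideal projection point of pixel (i, j) in metric units
                u, v = (j - cx) * pixel_pitch, (i - cy) * pixel_pitch
                u_d, v_d = distort(u, v)             # eq. (4)
                x, y = u_d / pixel_pitch + cx, v_d / pixel_pitch + cy
                j0, i0 = int(np.floor(x)), int(np.floor(y))
                if 0 <= i0 < h - 1 and 0 <= j0 < w - 1:
                    fx, fy = x - j0, y - i0          # bilinear weights
                    out[i, j] = ((1 - fx) * (1 - fy) * image[i0, j0]
                                 + fx * (1 - fy) * image[i0, j0 + 1]
                                 + (1 - fx) * fy * image[i0 + 1, j0]
                                 + fx * fy * image[i0 + 1, j0 + 1])
        return out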

It is to be noted that a rectified image is, however, always only an approximation of a non-distorted image, due to the procedure described above: either the pixel displacement is only correct to within half a pixel size, if the nearest-neighbour correction is used, or the image is smoothed and loses contrast.

The same rectification procedure can be applied if the image matrix contains distance data instead of (or in addition to) grey value data. The (x_i, y_i, z_i) coordinates computed from the pixel positions (a_i, b_i) and the rectified distance data via formulas (3) will, however, be inaccurate due to the approximations described above. Moreover, in the rectification process the information of some of the corner pixels will be discarded and thus be lost for further image processing.

Distortion-Free Coordinate Determination from 3D-Camera Data Using the Present Method

The present method proposes determining the position of object points in a scene from a digital image of this scene by means of predetermined vectors associated with the image points (one vector per image point), these vectors indicating the directions in the observed scene from which the light rays producing the respective image points emerge. The use of these vectors, which include the optical distortion of the camera, permits an accurate determination of the position of objects (or object points) from the digital image.

The present method is particularly well suited for use in occupancy classification methods for automotive occupant protection systems, such as airbags, in order to tailor the airbag deployment to the occupant type and position.

Therefore, the digital image may be acquired by means of a camera system arranged at a location so as to have a field of view covering the driver and/or the passenger seating areas. Preferably, the camera is of the 3D type, which allows a distance value to be obtained for each pixel. Such 3D cameras are known in the art and need not be described in detail herein.

The camera, which may operate in the visible or preferably the IR range, basically includes an optical system with one or more lenses and an image sensor array operatively coupled to the optical system, so that the sensor array translates the observed scene into a two-dimensional digital image. The image sensor array may be of the CCD or CMOS type. The acquired digital image is thus composed of a plurality of pixels that form the image points reflecting the image of the scene through the optical system. The distance information acquired for each pixel may e.g. be determined based on a time-of-flight measurement of light.

In practice, a time-of-flight camera may have one lens operatively coupled to one image sensor array, as well as a modulated light source. The method can also be implemented with a so-called stereo camera. Such a camera typically has two lenses and two image sensor arrays, and may further have an additional (non-modulated) light source. In the latter case, the method could be applied to images obtained from either of the two lenses.

As will appear from the following description, the present embodiment of the method permits an accurate determination of the three-dimensional coordinates of object points from digital images acquired by such a 3D-camera, without requiring any transformation of the image. This further avoids introducing approximations into the images and the need for additional computing power.

It is to be noted that the digital images (acquired by the camera) need not be rectified and will thus preferably stay unchanged. Instead, a set of predetermined vectors expressed in a reference three-dimensional coordinate system is used. This reference coordinate system preferably has its centre coinciding with the optical centre of the camera lens assembly. Furthermore, the predetermined vectors should preferably indicate the direction of a light ray passing through the centre of the optical system.

Advantageously, these vectors are calculated as unit vectors in a reference coordinate system (FIG. 4). This means that the coordinates of an object point corresponding to a given pixel of the sensor array may simply be obtained by multiplying the unit vector by the measured distance.

For each pixel i there shall thus be a determined unit vector e_i that indicates the direction in the object space of a light ray that will fall on that pixel after passing through the optical system. This means that the unit vector e_i allows an image point to be projected back into the projection plane as shown in FIG. 4, so that, in practice, a unit vector preferably indicates the inverted direction of the light ray. It will however be understood that what matters is that the unit vector, respectively the predetermined vector, be a directing vector of the light ray (i.e. indicating the global orientation of the light ray).

Although, as mentioned above, a vector may be determined for each pixel of the image sensor array, some regions of the image sensor array may, in practice, not receive relevant information. There is thus no need to calculate the vectors for the pixels located in these regions of the sensor array.

Determination of the Unit Vectors Comprising Optical Distortion

A prerequisite for the coordinate determination of object points is the calculation of the set of predetermined vectors. As already explained, this is done by calibration for a given camera configuration, and basically involves the identification, for each pixel of the image sensor, of the direction of the light ray in the object space that will fall onto the respective pixel.

These directions are preferably determined using a reciprocal distortion function F′:

(u, v) = F′(u_d, v_d)  (5).

How this reciprocal distortion function can be obtained within a calibration process is discussed by Heikkilä (see above).

For every pixel i, the reciprocal distortion function maps the pixel position (a_i, b_i) backward to its projection point (u_ci, v_ci) in the projection plane P (see FIG. 4). This is exactly the point in the projection plane that will be mapped onto pixel i by the camera optics. Normalizing the point (x_i, y_i, z_i) = (u_ci, v_ci, f) to unit length then yields the unit vector pointing from the optical centre in the direction of the projection point, i.e.:

$e_i = \begin{pmatrix} e_i^x \\ e_i^y \\ e_i^z \end{pmatrix} = \frac{1}{\sqrt{u_{ci}^2 + v_{ci}^2 + f^2}} \cdot \begin{pmatrix} u_{ci} \\ v_{ci} \\ f \end{pmatrix} \quad \text{for all } i. \qquad (6)$
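
A Python sketch of this calibration step is given below, assuming the reciprocal distortion function F′ is already available from a calibration procedure (function and parameter names are illustrative, not prescribed by the method):

    import numpy as np

    def compute_unit_vectors(h: int, w: int, f: float, pixel_pitch: float,
                             inv_distort) -> np.ndarray:
        """For every pixel, map its position backward to the projection
        point (u_c, v_c) via the reciprocal distortion function F' (eq. 5),
        append the focal length f and normalize, per equation (6)."""
        vectors = np.empty((h, w, 3))
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
        for i in range(h):
            for j in range(w):
                a, b = (j - cx) * pixel_pitch, (i - cy) * pixel_pitch
                u_c, v_c = inv_distort(a, b)        # (u, v) = F'(u_d, v_d)
                v3 = np.array([u_c, v_c, f])
                vectors[i, j] = v3 / np.linalg.norm(v3)
        return vectors  # computed once per camera configuration, then stored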

Calculating Correct Coordinates Using Unit Vectors

Once the unit vectors are calculated, they can be stored and need not be recalculated during image acquisition. The calculation of the 3D coordinates in the camera coordinate system is then realized simply by multiplying the distance value d_i measured at pixel i by the corresponding unit vector e_i, i.e.

$r_i = \begin{pmatrix} x_i \\ y_i \\ z_i \end{pmatrix} = d_i \cdot e_i = d_i \cdot \begin{pmatrix} e_i^x \\ e_i^y \\ e_i^z \end{pmatrix} \quad \text{for all } i. \qquad (7)$

Since the unit vectors are preferably stored, it is not necessary to perform a new calibration each time the method is implemented; the unit vectors may simply be retrieved from their storage location. This implies an important gain in computing time and data processing compared to methods which involve image transformation for position determination.
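
At runtime, equation (7) thus reduces to a single element-wise multiplication over the whole distance image. A possible vectorized sketch (illustrative names, reusing the stored vectors from the calibration sketch above) is:

    import numpy as np

    def coordinates_from_distance(distance_image: np.ndarray,
                                  unit_vectors: np.ndarray) -> np.ndarray:
        """Equation (7) applied to a whole frame: r_i = d_i * e_i for all i.
        distance_image has shape (h, w); unit_vectors has shape (h, w, 3),
        e.g. as precomputed by compute_unit_vectors() and kept in memory."""
        return distance_image[..., np.newaxis] * unit_vectors

    # Example: a flat target at 2 m filling the whole field of view.
    # vectors = compute_unit_vectors(480, 640, f=0.006, pixel_pitch=1e-5,
    #                                inv_distort=my_inverse_distortion)
    # xyz = coordinates_from_distance(np.full((480, 640), 2.0), vectors)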

Coordinate Calculation with Respect to a World Coordinate System

In the present method, the coordinates of the object points are determined in the camera coordinate system. However, if the coordinates are to be calculated with respect to another coordinate system (a so-called "world coordinate system"), one can generate a world unit vector system from the camera unit vector system.

Usually the transformation from one coordinate system to another is realized by a so-called homogeneous transformation that includes a 3×3 rotation matrix R and a translation vector T. Applying the rotation matrix R to each of the unit vectors e_i of the camera coordinate system yields the unit vectors e_i^w in the world coordinate system, i.e.

e_i^w = R·e_i for all i.  (8)

Again, these unit vectors of the world coordinate system only have to be computed once and can be kept in memory during runtime. The coordinate calculation during runtime is in this case realized by the simple formula

$r_i^w = \begin{pmatrix} x_i^w \\ y_i^w \\ z_i^w \end{pmatrix} = d_i \cdot e_i^w + T \quad \text{for all } i. \qquad (9)$
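
Under the same illustrative assumptions as the sketches above, the world-frame variant only changes the precomputation and adds the translation of equation (9); the matrix R and vector T below are hypothetical calibration outputs:

    import numpy as np

    def world_unit_vectors(unit_vectors: np.ndarray, R: np.ndarray) -> np.ndarray:
        """Equation (8): rotate every camera-frame unit vector into the
        world frame once, e_i^w = R·e_i, and keep the result in memory."""
        return unit_vectors @ R.T   # applies R to each vector of the (h, w, 3) array

    def world_coordinates(distance_image: np.ndarray,
                          world_vectors: np.ndarray, T: np.ndarray) -> np.ndarray:
        """Equation (9): r_i^w = d_i · e_i^w + T for all i."""
        return distance_image[..., np.newaxis] * world_vectors + T

    # Example: camera pitched 30° about the x-axis, mounted 1.2 m above the
    # world origin.
    theta = np.deg2rad(30.0)
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0, np.cos(theta), -np.sin(theta)],
                  [0.0, np.sin(theta), np.cos(theta)]])
    T = np.array([0.0, 0.0, 1.2])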

CLAIMS

1. Method for determining the position of an object point in a scene from a digital image of said scene acquired through an optical system, said image comprising an image point corresponding to said object point, wherein the position of said object point is determined by means of a predetermined vector associated with said image point, said predetermined vector representing the direction of a light ray in the object space that will produce this image point through said optical system.

2. Method according to claim 1, wherein said digital image comprises a number of image points corresponding to respective object points, and wherein a respective predetermined vector is associated with each of said image points.

3. Method according to claim 1, wherein each predetermined vector indicates the direction of a light ray passing through the optical centre of the lens assembly of said optical system.

4. Method according to claim 1, wherein each predetermined vector is determined by calibration.

5. Method according to claim 1, wherein distance information is associated with an image point, and the position of the corresponding object point is determined based on the predetermined vector and said distance information associated with said image point.

6. Method according to claim 5, wherein measured distance information is associated with each image point, said distance information being indicative of the measured remoteness of the corresponding object point.

7. Method according to claim 6, wherein each predetermined vector is a unitary vector in a reference coordinate system, and the coordinates of an object point corresponding to an image point are determined by multiplying the measured distance information by the predetermined vector associated with this image point.

8. Method according to claim 7, wherein said measured distance information is indicative of the distance from the centre of said optical system to the corresponding object point.

9. Method according to claim 1, wherein said digital image is acquired by means of an image sensor array.

10. Method according to claim 1, wherein a grey value is associated with each image point.

11. Method according to claim 1, wherein the coordinates of said predetermined vectors expressed in a reference coordinate system are transformed into another coordinate system.

12. Use of the method according to claim 1 in a method for detecting and classifying vehicle occupants of an automotive occupant protection system.

13. Method according to claim 5, wherein each predetermined vector indicates the direction of a light ray passing through the optical centre of the lens assembly of said optical system.